Predicting the future health information of patients from their historical Electronic Health Records (EHR) is a core research task in the development of personalized healthcare. Patient EHR data consist of sequences of visits over time, where each visit contains multiple medical codes, including diagnosis, medication, and procedure codes. The most important challenges for this task are to model the temporality and high dimensionality of sequential EHR data and to interpret the prediction results. Existing work addresses this problem by employing recurrent neural networks (RNNs) to model EHR data and using simple attention mechanisms to interpret the results. However, the performance of RNNs drops when sequences are long, and current RNN-based approaches ignore the relationships between subsequent visits. To address these issues, we propose Dipole, an end-to-end, simple and robust model for predicting patients' future health information. Dipole employs bidirectional recurrent neural networks to remember all the information of both the past visits and the future visits, and it introduces three attention mechanisms to measure the relationships of different visits for the prediction. With the attention mechanisms, Dipole can interpret the prediction results effectively. Dipole also allows us to interpret the learned medical code representations, which are confirmed positively by medical experts. Experimental results on two real-world EHR datasets show that the proposed Dipole can significantly improve the prediction accuracy compared with the state-of-the-art diagnosis prediction approaches and provide clinically meaningful interpretation.
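The general idea of combining a bidirectional RNN with attention over earlier visits can be illustrated with a minimal NumPy sketch. It is not the Dipole implementation: the plain tanh recurrence, the bilinear attention score, the softmax output layer, the random untrained weights, and every name used here (`rnn_pass`, `predict_next_visit`, `make_params`, `W_att`, and so on) are assumptions chosen for illustration. Each visit is assumed to be encoded as a multi-hot vector over the medical codes.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def rnn_pass(inputs, W_in, W_rec):
    # Plain tanh RNN (an assumption; any recurrent cell could be used).
    # Returns the hidden state after each input, shape (T, hidden).
    h = np.zeros(W_rec.shape[0])
    states = []
    for x in inputs:
        h = np.tanh(W_in @ x + W_rec @ h)
        states.append(h)
    return np.stack(states)

def make_params(num_codes, hidden):
    # Hypothetical random weights; a real model would learn these.
    def w(*shape):
        return rng.normal(scale=0.1, size=shape)
    return {
        "W_in_f": w(hidden, num_codes), "W_rec_f": w(hidden, hidden),
        "W_in_b": w(hidden, num_codes), "W_rec_b": w(hidden, hidden),
        "W_att": w(2 * hidden, 2 * hidden),   # bilinear attention score
        "W_c": w(2 * hidden, 4 * hidden),     # mixes context and query
        "W_out": w(num_codes, 2 * hidden),    # maps to code probabilities
    }

def predict_next_visit(visits, p):
    # visits: (T, num_codes) multi-hot matrix, one row per visit, T >= 2.
    fwd = rnn_pass(visits, p["W_in_f"], p["W_rec_f"])
    # Backward pass reads the visits in reverse, then is re-aligned.
    bwd = rnn_pass(visits[::-1], p["W_in_b"], p["W_rec_b"])[::-1]
    H = np.concatenate([fwd, bwd], axis=1)        # (T, 2 * hidden)

    query, keys = H[-1], H[:-1]                   # last visit attends to earlier ones
    alpha = softmax(keys @ (p["W_att"] @ query))  # one weight per earlier visit
    context = alpha @ keys                        # weighted sum of earlier states

    combined = np.tanh(p["W_c"] @ np.concatenate([context, query]))
    probs = softmax(p["W_out"] @ combined)        # distribution over codes
    return probs, alpha

num_codes, hidden, T = 6, 4, 4
params = make_params(num_codes, hidden)
visits = rng.integers(0, 2, size=(T, num_codes)).astype(float)
probs, alpha = predict_next_visit(visits, params)
print("attention over earlier visits:", np.round(alpha, 3))
print("predicted code distribution:", np.round(probs, 3))
```

In this sketch the attention weights `alpha` are what make the prediction inspectable: a larger weight means the corresponding earlier visit contributed more to the context vector used for the prediction.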